WHAT IS CLAIMED IS:

1. A method of implementing a discrete cosine transform (DCT) in a graphics processing 
unit (GPU), comprising: 

separating an image into blocks of pixels; 
for each block of pixels, in parallel, 

multiplying a column or row of pixels with a predetermined matrix to 
generate a corresponding set of output pixels; 

determining sets of scanlines based on the sets of output pixels; and 
for each set of scanlines, sampling at least a portion of the pixels 
comprised within the scanlines and pixels relative to the scanlines, and multiplying the 
sampled pixels with a row or column of the predetermined matrix. 
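
The two multiplication passes recited in claim 1 can be illustrated on the CPU. The following is a minimal sketch, not the claimed GPU implementation: it assumes an 8 x 8 block size and an orthonormal DCT-II matrix as the "predetermined matrix", neither of which the claim specifies, and the helper names (`dct_matrix`, `split_blocks`, `dct_block`) are hypothetical.

```python
import math

N = 8  # assumed block size; the claim does not fix one


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C (assumed form of the predetermined matrix)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def split_blocks(image, n=N):
    """Yield (top, left, block) tiles; image dimensions assumed multiples of n."""
    for top in range(0, len(image), n):
        for left in range(0, len(image[0]), n):
            yield top, left, [row[left:left + n] for row in image[top:top + n]]


def dct_block(block):
    """Column pass followed by row pass: C * X * C^T."""
    c = dct_matrix(len(block))
    column_pass = matmul(c, block)             # every column multiplied by C
    return matmul(column_pass, transpose(c))   # every row multiplied by C


if __name__ == "__main__":
    image = [[1.0] * 16 for _ in range(8)]     # two flat 8 x 8 blocks
    for top, left, block in split_blocks(image):
        print(top, left, round(dct_block(block)[0][0], 6))
```

Each block is independent of the others in this sketch, which is what makes the per-block work suitable for parallel execution.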

2. The method of claim 1, wherein the multiplying, determining, and sampling are 
performed in the GPU. 

3. The method of claim 1, wherein each corresponding set of output pixels corresponds 
to a textured line across the pixels in the blocks of pixels. 

4. The method of claim 1, wherein sampling the pixels comprised within the scanlines 
comprises using a separate shader for each set of scanlines. 

5. The method of claim 4, further comprising defining an array of coordinate offsets to 
neighboring pixels, wherein the shader accesses the pixels in the scanlines using the 
offset array. 

6. The method of claim 4, wherein the same shader can be used for each pixel in a 
scanline. 
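
One way to read claims 4 through 6 is sketched below: a separate "shader" is built for each scanline position within a block, each shader carries its own array of coordinate offsets to the neighboring pixels in the same column, and the same shader is then reused for every pixel along its scanline. This is an illustrative CPU emulation under stated assumptions (8 x 8 blocks, orthonormal DCT-II coefficients, hypothetical names `make_shader` and `render_column_pass`), not actual shader code.

```python
import math

N = 8  # assumed block size; the claims do not specify one


def dct_row(k, n=N):
    """Row k of an orthonormal DCT-II matrix (assumed predetermined matrix)."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)]


def make_shader(k, n=N):
    """Build the shader for scanline k of every block.

    The offset array holds (dx, dy) steps from the current pixel to the
    n pixels of its column inside the block.
    """
    offsets = [(0, i - k) for i in range(n)]
    coeffs = dct_row(k, n)

    def shader(texture, x, y):
        # Sample the neighbors through the offset array and accumulate.
        return sum(c * texture[y + dy][x + dx]
                   for (dx, dy), c in zip(offsets, coeffs))

    return shader


def render_column_pass(texture, n=N):
    """Run one shader per scanline set; texture height assumed a multiple of n."""
    shaders = [make_shader(k, n) for k in range(n)]
    height, width = len(texture), len(texture[0])
    return [[shaders[y % n](texture, x, y) for x in range(width)]
            for y in range(height)]
```

Because the offsets are relative to the current pixel, the shader for scanline k needs no per-pixel setup as it moves along the scanline.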

7. A method of processing pixels, comprising: 

separating an image into blocks of pixels; 
creating a polyline of pixels for each column or row in each block of pixels; and
creating a line for each row or column in each block of pixels, wherein the rows 
or columns correspond to the polylines created for each column or row. 

8. The method of claim 7, further comprising: 

creating a polyline of pixels for each row or column in each block of pixels; and 
creating a line for each column or row in each block of pixels, wherein the rows 
or columns correspond to the polylines created for each row or column. 

9. The method of claim 7, further comprising: 

determining sets of scanlines based on the lines created for each row or column in 
each block of pixels; and 

for each set of scanlines, sampling the pixels comprised within the scanlines and 
multiplying the sampled pixels with a row or column of a predetermined matrix. 

10. The method of claim 7, wherein the steps of creating are performed in a graphics 
processing unit (GPU). 

11. A method of processing pixels, comprising: 

separating an image into blocks of pixels; 

determining a polyline of pixels for each column or row in each block of pixels; 
for each pixel in the polyline, 

sampling at least a portion of the other pixels in the corresponding column 
or row that lies along the polyline and pixels relative to the column or row; 

multiplying each of the other pixels by a discrete cosine transform (DCT) 
coefficient from a predetermined matrix to generate resultant values; and 

adding the resultant values together to generate a resulting value. 

12. The method of claim 11, further comprising biasing and scaling at least one of the 
polyline of pixels, the resultant values, and each resulting value for each pixel. 
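
The multiply-and-add of claim 11 and the biasing and scaling of claim 12 can be sketched as follows. The claims do not state why or how values are biased and scaled; this sketch assumes the purpose is to fit signed transform results into a [0, 1] storage range, and the constants (a scale of 1 / (2 * sqrt(n)) and a bias of 0.5) and function names are hypothetical choices for input pixels in [0, 1].

```python
import math


def dct_coeff(k, i, n):
    """Orthonormal DCT-II coefficient (assumed form of the predetermined matrix)."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))


def transform_pixel(column, k):
    """Multiply each sampled pixel by its coefficient, then add the products."""
    n = len(column)
    resultants = [dct_coeff(k, i, n) * column[i] for i in range(n)]
    return sum(resultants)


def to_unit_range(value, n):
    """Scale and bias a signed result into [0, 1] (assumed constants).

    For inputs in [0, 1] the result magnitude is at most sqrt(n).
    """
    return value / (2.0 * math.sqrt(n)) + 0.5


def from_unit_range(stored, n):
    """Undo the scale and bias applied by to_unit_range."""
    return (stored - 0.5) * (2.0 * math.sqrt(n))


if __name__ == "__main__":
    column = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
    for k in range(len(column)):
        value = transform_pixel(column, k)
        print(k, round(value, 4), round(to_unit_range(value, len(column)), 4))
```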

13. A method of processing pixels, comprising:

separating an image into blocks of pixels; 

for each column in a block of pixels, setting up a shader and rendering a scanline; and

for each row in a block of pixels, setting up a shader and rendering a column. 

14. The method of claim 13, wherein setting up the shaders and rendering are performed 
in the GPU. 

15. A system to program a graphics processing unit (GPU) to implement a discrete 
cosine transform (DCT). 

16. The system of claim 15, wherein the GPU is adapted to receive blocks of pixels that 
an image has been separated into, and process each block of pixels, in parallel, by 

multiplying a column or row of pixels of an image with a predetermined matrix to 
generate a corresponding set of output pixels; 

determining sets of scanlines based on the sets of output pixels; and 

for each set of scanlines, sampling the pixels comprised within the scanlines and 
multiplying the sampled pixels with a row or column of the predetermined matrix. 

17. The system of claim 16, further comprising a central processing unit (CPU) coupled 
to the GPU by a system bus, the CPU capable of separating the image into the blocks of 
pixels. 

18. The system of claim 16, wherein each corresponding set of output pixels corresponds 
to a textured line across the pixels in the blocks of pixels. 

19. The system of claim 16, wherein the GPU comprises a separate shader for sampling 
the pixels comprised within each set of the scanlines. 



20. The system of claim 19, wherein the GPU defines an array of coordinate offsets to 
neighboring pixels, wherein the shader accesses the pixels in the scanlines using the 
offset array. 

21. The system of claim 19, wherein the same shader can be used for each pixel in a 
scanline. 



